DETAILED ACTION 

1. This communication is in response to the Arguments and Amendments filed on 
01/05/2009. Claims 1-15 remain pending and have been examined. The Applicants' 
amendment and remarks have been carefully considered, but they do not place the 
claims in condition for allowance. Accordingly, this Action has been made FINAL. 

2. All previous objections and rejections directed to the Applicant's disclosure and 
claims not discussed in this Office Action have been withdrawn by the Examiner. 

Response to Amendment 

3. Applicants' amendments filed on 01/05/2009 have been fully considered. The 
newly amended limitations in claims 1, 6, and 11 necessitate new grounds of rejection. 
Specifically, the newly added limitation of "subsequent to grouping the plurality of 
phoneme clusters" necessitates new grounds of rejection. 

Response to Arguments 

4. Applicant's arguments in the Arguments (pages 13-17) filed on 01/05/2009 with 
regard to claims 1-15, regarding the rejections under 35 USC 103, have been fully 
considered but they are moot in view of the new grounds of rejection. 

The rejections with regard to 35 USC 101, pertaining to claims 11-15, have 
been withdrawn in view of the amendments to the claims and Specification that were 
submitted. However, the arguments presented pertaining to claims 1-15 are not 
persuasive. The Applicants argue that the machine-transformation test as outlined by 
the Supreme Court is the proper test to apply. In view of this argument, the Applicants 
argue that the received speech signals are transformed into a plurality of phoneme 
clusters and thus maintain that such test is fulfilled. The Examiner respectfully disagrees 
with this assertion. To begin with, the machine-transformation test applies to a process 
claim, and such decision was made subsequent to the mailing of the Office Action on 
09/04/2008, where the ruling was made on 10/30/2008. The machine-transformation 
test is not fulfilled by the claim. The claim is not tied to specific hardware, such as a 
processor, and does not result in a physical transformation of one article into another 
state or thing. Speech signals are not articles since they are not capable of being 
reduced alone. Further, claims 6 and 11 still are pertinent to the "useful, concrete, and 
tangible result" test since such limitations are related to an abstract idea of grouping 
phoneme clusters where such grouping cannot be realized in tangible form. Hence, the 
101 rejection is maintained. 

Claim Rejections - 35 USC § 101 

5. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

Claims 1-15 are directed toward non-statutory subject matter. 

Claims 1-15 are directed towards a method for recognizing an input speech of a 
word sequence. To be statutory, a claimed process must either: (A) result in a physical 
transformation for which a practical application is either disclosed in the specification or 
would have been known to a skilled artisan, or (B) be limited to a practical application 
which produces a useful, tangible, and concrete result. See Diehr, 450 U.S. at 183-84, 
209 USPQ at 6 (quoting Cochrane v. Deener, 94 U.S. 780, 787-88 ("A [statutory] 
process is a mode of treatment of certain materials to produce a given result. It is an 
act, or a series of acts, performed upon the subject matter to be transformed and reduced to 
a different state or thing.... The process requires certain things should be done with 
certain substances, and in a certain order; but the tools to be used in doing this may be 
of secondary consequence."). In the present case, claims 1, 6, and 11 refer to an 
algorithm for receiving speech and grouping phoneme clusters; there is no output being 
defined that causes there to be a tangible result and there is no physical transformation 
of the input sequence of words. Rather, there appears to be an algorithm grouping 
phonemes in clusters and determining whether a second cluster is needed. Since the 
presently claimed invention neither performs a transformation nor actively produces a 
useful, concrete and tangible result, claims 1, 6, and 11 are directed towards 
non-statutory subject matter. 

As such, claims 1, 6, and 11 are directed towards non-statutory subject matter. 
The dependent claims 2-5, 7-10, and 12-15 fail to overcome the 35 U.S.C. 101 rejection 
directed towards independent claims 1, 6, and 11. 



Claim Rejections - 35 USC § 103 

6. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a 
person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived 
by the manner in which the invention was made. 

7. Claims 1-15 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Kao et al. (U.S. Patent: 6,317,712 B1), hereinafter referred to as Kao, in view of Hwang et 
al. (US 6,141,641), hereinafter referred to as Hwang. 

As per claims 1 and 11, Kao teaches a speech processing method comprising: 

receiving speech signals (Kao, figure 3, subblock 11, collects speech 
data); 

processing the received speech signals (Kao, figure 3, subblock 12 and 
13, training of triphone models); 

to generate a plurality of phoneme clusters (Kao, figure 3, subblock 14, 
clustering of triphone by decision tree); 

grouping the plurality of phoneme clusters into a first cluster node and a 
second cluster node, wherein the first cluster node comprises at least one 
phoneme cluster from the plurality of phoneme clusters (Kao, figure 3, subblock 
14, clustering triphones; figure 4, figure of tree shown and see col. 3, lines 47-55, 
clustering is done initially with all phoneme models and cluster according to a 
yes/no relationship); 

and a likelihood increase criterion (see col. 6, lines 58-64-67, where the 
nodes are split based on likelihood improvement). 

Kao does not explicitly teach determining subsequent to grouping the 
plurality of phoneme clusters if a phoneme cluster in the first cluster node is to be 
moved into the second cluster node based on a likelihood increase of the phone 
cluster of the first cluster node belonging to the second cluster node instead of 
belonging to the first cluster node. 

However, Hwang et al. teaches determining subsequent to grouping the 
plurality of phoneme clusters (see col. 8, lines 20-28 and col. 9, lines 13-17, 
where senones (previously grouped in tree) are selected) when the at least one 
phoneme cluster in the first cluster node is to be moved into the second cluster 
node (see col. 9, line 30, where parameters from the deep senones are merged 
based on a criterion) based on a likelihood increase of the phone cluster in the 
first cluster node belonging to the second cluster node instead of belonging to the 
first cluster node (see col. 9, lines 21-25, where clustering is performed if the 
reduction of likelihood is least). 

Kao and Hwang are analogous art because they are from a similar field of 
endeavor in speech processing and large vocabulary speech recognition 
applications. Thus, it would have been obvious to one of ordinary skill in the art at 
the time the invention was made to modify the speech processing as taught by 
Kao with the determination of when one cluster is to be moved into another cluster as 
taught by Hwang for the purpose of reducing resources allocated for speech 
recognition (see col. 9, lines 1-6), thus allowing resources to be used elsewhere. 
As to claim 11, the limitations in this claim are similar in scope to claim 1 and are 
rejected, and further Kao teaches the machine-readable medium (see col. 2, lines 20-21, 
CD-ROM and software). 
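
For illustration only, the claimed step of deciding whether a phoneme cluster should move from a first cluster node to a second cluster node "based on a likelihood increase" can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of Kao, of Hwang, or of the Applicants: each node is assumed to be modeled by a single one-dimensional Gaussian fitted to its pooled samples, and the names `gaussian_loglik`, `node_loglik`, `move_gain`, and `maybe_move`, as well as the variance floor and the threshold, are hypothetical.

```python
import math


def gaussian_loglik(samples, var_floor=1e-6):
    """Log-likelihood of samples under a 1-D Gaussian fitted to those samples.

    Assumption: a single Gaussian per node; var_floor is a hypothetical
    safeguard against zero variance.
    """
    n = len(samples)
    if n == 0:
        return 0.0
    mean = sum(samples) / n
    sq_dev = sum((x - mean) ** 2 for x in samples)
    var = max(sq_dev / n, var_floor)
    return -0.5 * n * math.log(2 * math.pi * var) - sq_dev / (2 * var)


def node_loglik(node):
    """A node is a dict mapping a cluster name to its list of samples."""
    pooled = [x for samples in node.values() for x in samples]
    return gaussian_loglik(pooled)


def move_gain(first, second, name):
    """Total log-likelihood after moving `name` from first to second, minus before."""
    before = node_loglik(first) + node_loglik(second)
    first_after = {k: v for k, v in first.items() if k != name}
    second_after = dict(second)
    second_after[name] = first[name]
    after = node_loglik(first_after) + node_loglik(second_after)
    return after - before


def maybe_move(first, second, name, threshold=0.0):
    """Move the cluster only if the likelihood increase exceeds the threshold."""
    if move_gain(first, second, name) > threshold:
        second[name] = first.pop(name)
        return True
    return False
```

In this sketch, a cluster whose samples resemble those already in the second node yields a positive gain and is moved, while a cluster that fits its current node yields a negative gain and stays where it is.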
As per claims 2 and 12, Kao, in view of Hwang, teaches the speech processing 
method as claimed in claim 1. 

Furthermore, Hwang teaches moving the at least one phoneme cluster in 
the first cluster node into the second cluster node responsive to the 
determination subsequent to grouping the plurality of phoneme clusters (see col. 
8, lines 20-28 and col. 9, lines 13-17, where senones (previously grouped in tree) 
are selected, and see col. 9, line 30, where parameters from the deep senones are merged 
based on a criterion). 

As per claims 3 and 13, Kao, in view of Hwang, teaches the speech processing 
method as claimed in claim 2. 

Furthermore, Hwang teaches moving the at least one phoneme cluster in 
the first cluster node into the second cluster node when the most likelihood 
increase is more than a threshold value (see col. 9, lines 21-25, where clustering 
is performed if the reduction of likelihood is least and see col. 11, line 51, 
equation 12, where the reduction is greater than 0). 

As per claims 4 and 14, Kao, in view of Hwang, teaches the speech processing 
method as claimed in claim 1. 

Furthermore, Kao teaches wherein the phoneme clusters are triphone 
clusters based on a hidden Markov model (HMM) (see col. 3, line 41; "Applicants 
teach to tie triphone HMMs"). 

As per claims 5 and 15, Kao, in view of Hwang, teaches the speech processing 
method as claimed in claim 1. 

Furthermore, Kao teaches grouping the triphone clusters according to 
answers to best phonetic context based questions related to the triphone clusters 
(see col. 3, lines 47-60, where clustering is performed based on yes/no 
questions). 

As per claim 6, Kao teaches a speech processing system comprising: 

an input to receive speech signals (Kao, figure 1, subblock MIC, figure 2, 
subblock MIC); 

a processing unit (see col. 2, lines 18-20, PC) to: 

process received speech signals (Kao, figure 3, subblock 12 and 
13, training of triphone models); 

generate a plurality of phoneme clusters (Kao, figure 3, subblock 
14, clustering of triphone by decision tree); 

group the plurality of phoneme clusters into a first cluster node and 
a second cluster node, wherein the first cluster node comprises at least 
one phoneme cluster from the plurality of phoneme clusters (Kao, figure 3, 
subblock 14, clustering triphones; figure 4, figure of tree shown and see 
col. 3, lines 47-55, clustering is done initially with all phoneme models and 
cluster according to a yes/no relationship); 

and a likelihood increase criterion (see col. 6, lines 58-64-67, where 
the nodes are split based on likelihood improvement). 

Kao does not explicitly teach determining subsequent to grouping the 
plurality of phoneme clusters if a phoneme cluster in the first cluster node is to be 
moved into the second cluster node based on a likelihood increase of the phone 
cluster of the first cluster node belonging to the second cluster node instead of 
belonging to the first cluster node. 

However, Hwang et al. teaches determining subsequent to grouping the 
plurality of phoneme clusters (see col. 8, lines 20-28 and col. 9, lines 13-17, 
where senones (previously grouped in tree) are selected) when the at least one 
phoneme cluster in the first cluster node is to be moved into the second cluster 
node (see col. 9, line 30, where parameters from the deep senones are merged 
based on a criterion) based on a likelihood increase of the phone cluster in the 
first cluster node belonging to the second cluster node instead of belonging to the 
first cluster node (see col. 9, lines 21-25, where clustering is performed if the 
reduction of likelihood is least). 

Kao and Hwang are analogous art because they are from a similar field of 
endeavor in speech processing and large vocabulary speech recognition 
applications. Thus, it would have been obvious to one of ordinary skill in the art at 
the time the invention was made to modify the speech processing as taught by 
Kao with the determination of when one cluster is to be moved into another cluster as 
taught by Hwang for the purpose of reducing resources allocated for speech 
recognition (see col. 9, lines 1-6), thus allowing resources to be used elsewhere. 

As per claim 7, Kao, in view of Hwang, teaches the speech processing system as 
claimed in claim 6. 

Furthermore, Hwang teaches moving the at least one phoneme cluster in 
the first cluster node into the second cluster node responsive to the 
determination subsequent to grouping the plurality of phoneme clusters (see col. 
8, lines 20-28 and col. 9, lines 13-17, where senones (previously grouped in tree) 
are selected, and see col. 9, line 30, where parameters from the deep senones are merged 
based on a criterion). 

As per claim 8, Kao, in view of Hwang, teaches the speech processing system as 
claimed in claim 7. 

Furthermore, Hwang teaches moving the at least one phoneme cluster in 
the first cluster node into the second cluster node when the most likelihood 
increase is more than a threshold value (see col. 9, lines 21-25, where clustering 
is performed if the reduction of likelihood is least and see col. 11, line 51, 
equation 12, where the reduction is greater than 0). 

As per claim 9, Kao, in view of Hwang, teaches the speech processing system as 
claimed in claim 6. 

Furthermore, Kao teaches wherein the phoneme clusters are triphone 
clusters based on a hidden Markov model (HMM) (see col. 3, line 41; "Applicants 
teach to tie triphone HMMs"). 

As per claim 10, Kao, in view of Hwang, teaches the speech processing system 
as claimed in claim 9. 

Furthermore, Kao teaches grouping the triphone clusters according to 
answers to best phonetic context based questions related to the triphone clusters 
(see col. 3, lines 47-60, where clustering is performed based on yes/no 
questions). 

Conclusion 

8. Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 
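
As an illustration only, the timing rule stated in the preceding paragraph can be expressed as a short date calculation. This is a sketch under stated assumptions, not an official computation: the function names `add_months` and `reply_period_expiry` are hypothetical, month arithmetic is assumed to clamp to the last day of a shorter month, the six-month limit is measured from the same date supplied as the mailing date, and weekend or holiday adjustments and any extension of time under 37 CFR 1.136(a) are ignored.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the same day-of-month `months` later.

    Assumption: if the target month is shorter, clamp to its last day.
    """
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_period_expiry(mailing, first_reply=None, advisory=None):
    """Expiry of the shortened statutory period as described in the text.

    mailing     -- mailing date of the final action
    first_reply -- date the first reply was filed, if any
    advisory    -- mailing date of the advisory action, if any
    """
    two_months = add_months(mailing, 2)
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)

    expiry = three_months
    # A first reply within two months, followed by an advisory action mailed
    # after the three-month period, moves the expiry to the advisory date.
    if (first_reply is not None and advisory is not None
            and first_reply <= two_months and advisory > three_months):
        expiry = advisory
    # In no event does the period run past six months.
    return min(expiry, six_months)
```

The function takes a mailing date supplied by the caller; no mailing date is assumed here, since the date on which this action was mailed is not stated in the text above.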

9. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Junqua (US 5,806,030) is cited to disclose clustering phone models. Komori (US 
6,108,628) is cited to disclose reclustering of speaker models. Aubert (US 6,339,759) is 
cited to disclose acoustic model determination based on grouping. Rigazio (US 
6,526,379) is cited to disclose clustering methods for speech recognition. 

Chou et al. ("High Resolution Decision Tree Based Acoustic Modelling Beyond 
CART") is cited to disclose decision tree clustering based on generating likelihoods 
using multiple Gaussian mixtures. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to PARAS SHAH whose telephone number is (571)270-1650. 
The examiner can normally be reached on MON.-THURS. 7:00 a.m.-4:00 p.m. 
EST. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick Edouard, can be reached on (571)272-7603. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published 
applications may be obtained from either Private PAIR or Public PAIR. Status 
information for unpublished applications is available through Private PAIR only. For 
more information about the PAIR system, see http://pair-direct.uspto.gov. Should you 
have questions on access to the Private PAIR system, contact the Electronic Business 
Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO 
Customer Service Representative or access to the automated information system, call 
800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/P. S./ 

Examiner, Art Unit 2626 

01/23/2009 

/Patrick N. Edouard/ 

Supervisory Patent Examiner, Art Unit 2626 



